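Renal structure segmentation from computed tomography angiography (CTA) is essential for many computer-assisted renal cancer treatment applications. The Kidney PArsing (KiPA 2022) Challenge aims to build a fine-grained multi-structure dataset and to improve the segmentation of multiple renal structures. Recently, U-Net has dominated medical image segmentation. In the KiPA challenge, we evaluated several U-Net variants and selected the best models for the final submission.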
Translated by Google Translate
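Deep learning (DL)-based medical image classification and segmentation is an urgent research topic for diagnosing the variant viruses of the current COVID-19 situation. In COVID-19 computed tomography (CT) images of the lungs, ground-glass opacity is the most common finding that requires specialist diagnosis. Based on this situation, some researchers have proposed DL models that can stand in for professional diagnostic specialists in clinics when expertise is lacking. However, although DL methods perform remarkably well in medical image processing, limited datasets can make it hard to reach human-level diagnostic accuracy. In addition, deep learning algorithms face the challenge of classifying and segmenting medical images in three or more dimensions while maintaining high accuracy. Consequently, while ensuring a high level of accuracy, our model classifies patients' CT images into three types: normal, pneumonia, and COVID. Subsequently, two datasets are used for segmentation, one of which contains only limited data (20 cases). Our system combines a classification model and a segmentation model, built on ResNet50 and 3D U-Net respectively. Fed with different datasets, the system segments the infected area in COVID images according to the classification results. Our model achieves 94.52% accuracy in classifying lung lesions into three types: COVID, pneumonia, and normal. For future medical use, embedding the model in medical facilities might be an efficient way to assist or substitute for doctors in diagnosis, so that a broader range of variant-virus problems in the COVID-19 situation may also be addressed successfully.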
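Finding extended hydrodynamic equations that are valid from the dense gas region to the rarefied gas region remains a great challenge. The key to success is to obtain accurate constitutive relations for stress and heat flux. Recent data-driven models offer a new phenomenological approach to learning constitutive relations from data. Such models enable complex constitutive relations that extend Newton's law of viscosity and Fourier's law of heat conduction by regression on higher derivatives. However, the choice of derivatives in these models is ad hoc, without a clear physical explanation. We investigate data-driven models theoretically on a linear system. We argue that these models are equivalent to nonlinear length-scale scaling laws of the transport coefficients. The equivalence to scaling laws justifies the physical plausibility of data-driven models and reveals their limitations. Our argument also indicates that modeling the scaling law explicitly can avoid practical difficulties in data-driven models, such as derivative estimation and variable selection on large data. We further propose a constitutive relation model based on scaling laws and test it on the calculation of Rayleigh scattering spectra. The results show, for the first time, that the data-driven model has a clear advantage over the Chapman-Enskog expansion and moment methods.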
This report describes the winning solution to the Robust Vision Challenge (RVC) semantic segmentation track at ECCV 2022. Our method adopts the FAN-B-Hybrid model as the encoder and uses SegFormer as the segmentation framework. The model is trained on a composite dataset consisting of images from 9 datasets (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, WildDash 2, IDD, BDD, and COCO) with a simple dataset balancing strategy. All the original labels are projected to a 256-class unified label space, and the model is trained using a cross-entropy loss. Without significant hyperparameter tuning or any specific loss weighting, our solution ranks first on all the testing semantic segmentation benchmarks from multiple domains (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, and WildDash 2). The proposed method can serve as a strong baseline for the multi-domain segmentation task and benefit future work. Code will be available at https://github.com/lambert-x/RVC_Segmentation.
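The projection of dataset-specific labels into one unified label space can be illustrated with a minimal sketch. The lookup tables, class ids, and the `IGNORE` value below are hypothetical toy values chosen for illustration; the abstract does not give the actual 256-class mapping, and this is not the authors' implementation.

```python
# Assumed "ignore" index for source classes with no unified counterpart.
IGNORE = -1

# Hypothetical per-dataset tables: dataset-local class id -> unified class id.
LABEL_MAPS = {
    "cityscapes": {0: 0, 1: 1, 2: 2},  # toy ids, e.g. road, sidewalk, building
    "ade20k": {0: 2, 1: 3, 2: 0},      # toy ids, e.g. building, sky, road
}


def project_labels(mask, dataset):
    """Map a 2D mask of dataset-specific ids into the unified label space."""
    table = LABEL_MAPS[dataset]
    return [[table.get(label, IGNORE) for label in row] for row in mask]


unified = project_labels([[0, 1], [2, 7]], "ade20k")
print(unified)  # [[2, 3], [0, -1]]
```

After such a projection, masks from every source dataset share one id space, so a single cross-entropy loss can be applied to all of them, typically skipping pixels marked with the ignore index.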
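Federated learning (FL) is a technique for training models using data distributed across devices. Differential privacy (DP) provides a formal privacy guarantee for sensitive data. Our goal is to train a large neural network language model (NNLM) on compute-constrained devices while preserving privacy using FL and DP. However, the DP noise introduced into the model increases as the model size grows, which often prevents convergence. We propose Partial Embedding Updates (PEU), a novel technique that decreases noise by decreasing the payload size. Furthermore, we adopt Low-Rank Adaptation (LoRA) and Noise Contrastive Estimation (NCE) to reduce the memory demands of large models on compute-constrained devices. This combination of techniques makes it possible to train large-vocabulary language models while preserving accuracy and privacy.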
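Why DP noise grows with model size can be seen in a minimal sketch of the standard Gaussian mechanism (clip each update, then add per-coordinate Gaussian noise). This is a generic illustration, not the PEU method; the function names and the `noise_multiplier` parameter are assumptions made for this example.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale the update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def add_gaussian_noise(update, clip_norm, noise_multiplier, rng):
    """Add N(0, (noise_multiplier * clip_norm)^2) noise to every coordinate."""
    sigma = noise_multiplier * clip_norm
    return [x + rng.gauss(0.0, sigma) for x in update]


def rms_noise_norm(num_params, clip_norm, noise_multiplier):
    """Root-mean-square L2 norm of the noise vector: sigma * sqrt(d)."""
    return noise_multiplier * clip_norm * math.sqrt(num_params)


rng = random.Random(0)
clipped = clip_update([3.0, 4.0], clip_norm=1.0)
noisy = add_gaussian_noise(clipped, clip_norm=1.0, noise_multiplier=1.0, rng=rng)

# The clipped signal norm stays bounded by clip_norm, while the noise norm
# scales with the square root of the number of transmitted parameters.
print(rms_noise_norm(100, 1.0, 1.0), rms_noise_norm(400, 1.0, 1.0))
```

Because the clipped signal is bounded while the noise norm grows with the number of transmitted coordinates, sending a smaller payload lowers the noise-to-signal ratio, which is the intuition the abstract gives for reducing payload size.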
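Built on top of self-attention mechanisms, vision transformers have recently demonstrated remarkable performance on a variety of vision tasks. While achieving excellent performance, they still require relatively intensive computation, and the cost scales up drastically as the numbers of patches, self-attention heads, and transformer blocks increase. In this paper, we argue that, because images vary widely, their need for modeling long-range dependencies between patches differs. To this end, we introduce AdaViT, an adaptive computation framework that learns to derive usage policies on which patches, self-attention heads, and transformer blocks to use throughout the backbone on a per-input basis, aiming to improve the inference efficiency of vision transformers with a minimal drop in accuracy for image recognition. A lightweight decision network is attached to the backbone and optimized jointly with it in an end-to-end manner to produce decisions on the fly. Extensive experiments on ImageNet demonstrate that our method obtains more than a 2x improvement in efficiency compared to state-of-the-art vision transformers with only a 0.8% drop in accuracy, achieving good efficiency/accuracy trade-offs under different computational budgets. We further conduct quantitative and qualitative analyses of the learned usage policies and provide more insights into the redundancy in vision transformers.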
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive. CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains robust even when the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
Knowledge graphs (KGs) have served as a key component of various natural language processing applications. Commonsense knowledge graphs (CKGs) are a special type of KG in which entities and relations are composed of free-form text. However, previous work on KG completion and CKG completion suffers from long-tail relations and newly added relations that do not have many known triples for training. In light of this, few-shot KG completion (FKGC), which requires the strengths of both graph representation learning and few-shot learning, has been proposed to tackle the problem of limited annotated data. In this paper, we comprehensively survey previous attempts at such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing works in terms of the type of KG and the methods. Finally, we present applications of FKGC models to prediction tasks in different areas and share our thoughts on future research directions for FKGC.
Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots; e.g., we boost nAP by +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and model will be available.
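One common way to turn a support mask into a class center is masked average pooling, followed by channel-wise re-weighting of the query features. The sketch below illustrates that general idea only; it is an assumption about how such a step might look, not the RefT module itself, and the function names and flat per-pixel feature layout are hypothetical.

```python
def masked_average_pool(features, mask):
    """Average per-pixel channel vectors over the pixels where mask is 1.

    features: list of per-pixel channel vectors (all the same length).
    mask: list of 0/1 flags, one per pixel.
    """
    num_channels = len(features[0])
    total = [0.0] * num_channels
    count = 0
    for vec, flag in zip(features, mask):
        if flag:
            count += 1
            for c in range(num_channels):
                total[c] += vec[c]
    if count == 0:
        return total  # empty mask: fall back to an all-zero center
    return [t / count for t in total]


def reweight(query_features, center):
    """Scale every query pixel channel-wise by the class center."""
    return [[q * w for q, w in zip(vec, center)] for vec in query_features]


support = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
center = masked_average_pool(support, [1, 1, 0])  # background pixel excluded
print(center)                                     # [2.0, 3.0]
print(reweight([[1.0, 1.0]], center))             # [[2.0, 3.0]]
```

Restricting the average to masked pixels keeps background features out of the class center, which is why the support mask matters for the re-weighting step.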
Graph Neural Networks (GNNs) have shown satisfying performance on various graph learning tasks. To achieve better fitting capability, most GNNs have a large number of parameters, which makes them computationally expensive. Therefore, it is difficult to deploy them on edge devices with scarce computational resources, e.g., mobile phones and wearable smart devices. Knowledge Distillation (KD) is a common solution for compressing GNNs, where a lightweight model (i.e., the student model) is encouraged to mimic the behavior of a computationally expensive GNN (i.e., the teacher GNN model). Nevertheless, most existing GNN-based KD methods lack fairness considerations. As a consequence, the student model usually inherits and even exaggerates the bias from the teacher GNN. To handle this problem, we take initial steps towards fair knowledge distillation for GNNs. Specifically, we first formulate a novel problem of fair knowledge distillation for GNN-based teacher-student frameworks. Then we propose a principled framework named RELIANT to mitigate the bias exhibited by the student model. Notably, the design of RELIANT is decoupled from any specific teacher and student model structures, and thus can be easily adapted to various GNN-based KD frameworks. We perform extensive experiments on multiple real-world datasets, which corroborate that RELIANT achieves less biased GNN knowledge distillation while maintaining high prediction utility.
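The "student mimics teacher" objective described above is often implemented as a temperature-softened KL divergence between the two models' output distributions. The sketch below shows that generic distillation loss under assumed names and an assumed temperature; it does not include the fairness component of RELIANT, which the abstract does not specify.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0
    )
    return temperature ** 2 * kl


teacher = [2.0, 0.5, -1.0]
print(distillation_loss(teacher, teacher))          # 0.0 for identical logits
print(distillation_loss([0.0, 0.0, 0.0], teacher))  # positive when they differ
```

Because the student is trained to reproduce the teacher's full output distribution, any systematic bias in the teacher's predictions is also part of the training signal, which is consistent with the inheritance of bias noted in the abstract.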